OcrV1, Main, Exploration, bibRecord, 000909

Set Intersection and Sequence Matching

Identifieur interne : 000909 ( Main/Exploration ); précédent : 000908; suivant : 000910

Set Intersection and Sequence Matching

Auteurs : Ariel Shiftan [Israël] ; Ely Porat [Israël]

Source :

Lecture Notes in Computer Science [ 0302-9743 ] ; 2009.

RBID : ISTEX:D11060EDE7AEBA4F13AD081893075264748C74CE

Abstract

Abstract: In the classical pattern matching problem, one is given a text and a pattern, both of which are sequences of letters, and is required to find all occurrences of the pattern in the text. We study two modifications of the classical problem, where each letter in the text and pattern is a set (Set Intersection Matching problem) or a sequence (Sequence Matching problem). Two “letters” are considered to be match if the intersection of the two corresponding sets is not empty, or if the two sequences have a common element in the same index. We show the first known non-trivial and efficient algorithms for these problems, for the case the maximum set/sequence size is small. The first, randomized, that takes $\Theta\left( 2^dn\ln n\log m\right)$ time, where d is the maximum set/sequence size, and can also fit, with slight modifications, for the case one is also interested in up to k mismatches. The second is deterministic and takes $\Theta\left( 4^{d}n\log m\right)$ . The third algorithm, also deterministic, is able to count the number of matches at each index of the text in total running time $\Theta\left( \sum_{i=1}^{d} {|\Sigma| \choose i} n\log m \right)$ .

Url:

https://api.istex.fr/document/D11060EDE7AEBA4F13AD081893075264748C74CE/fulltext/pdf

DOI: 10.1007/978-3-642-03784-9_28

Affiliations:

Israël

Links toward previous steps (curation, corpus...)

to stream Istex, to step Corpus: 002143
to stream Istex, to step Curation: 002001
to stream Istex, to step Checkpoint: 000431
to stream Main, to step Merge: 000917
to stream Main, to step Curation: 000909

Le document en format XML

<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc><titleStmt><title xml:lang="en">Set Intersection and Sequence Matching</title>
<author><name sortKey="Shiftan, Ariel" sort="Shiftan, Ariel" uniqKey="Shiftan A" first="Ariel" last="Shiftan">Ariel Shiftan</name>
</author>
<author><name sortKey="Porat, Ely" sort="Porat, Ely" uniqKey="Porat E" first="Ely" last="Porat">Ely Porat</name>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:D11060EDE7AEBA4F13AD081893075264748C74CE</idno>
<date when="2009" year="2009">2009</date>
<idno type="doi">10.1007/978-3-642-03784-9_28</idno>
<idno type="url">https://api.istex.fr/document/D11060EDE7AEBA4F13AD081893075264748C74CE/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">002143</idno>
<idno type="wicri:Area/Istex/Curation">002001</idno>
<idno type="wicri:Area/Istex/Checkpoint">000431</idno>
<idno type="wicri:doubleKey">0302-9743:2009:Shiftan A:set:intersection:and</idno>
<idno type="wicri:Area/Main/Merge">000917</idno>
<idno type="wicri:Area/Main/Curation">000909</idno>
<idno type="wicri:Area/Main/Exploration">000909</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title level="a" type="main" xml:lang="en">Set Intersection and Sequence Matching</title>
<author><name sortKey="Shiftan, Ariel" sort="Shiftan, Ariel" uniqKey="Shiftan A" first="Ariel" last="Shiftan">Ariel Shiftan</name>
<affiliation wicri:level="1"><country xml:lang="fr">Israël</country>
<wicri:regionArea>Department of Computer Science, Bar-Ilan University, 52900, Ramal Gab</wicri:regionArea>
<wicri:noRegion>Ramal Gab</wicri:noRegion>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">Israël</country>
</affiliation>
</author>
<author><name sortKey="Porat, Ely" sort="Porat, Ely" uniqKey="Porat E" first="Ely" last="Porat">Ely Porat</name>
<affiliation wicri:level="1"><country xml:lang="fr">Israël</country>
<wicri:regionArea>Department of Computer Science, Bar-Ilan University, 52900, Ramal Gab</wicri:regionArea>
<wicri:noRegion>Ramal Gab</wicri:noRegion>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">Israël</country>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series><title level="s">Lecture Notes in Computer Science</title>
<imprint><date>2009</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
<idno type="eISSN">1611-3349</idno>
<idno type="ISSN">0302-9743</idno>
</series>
<idno type="istex">D11060EDE7AEBA4F13AD081893075264748C74CE</idno>
<idno type="DOI">10.1007/978-3-642-03784-9_28</idno>
<idno type="ChapterID">28</idno>
<idno type="ChapterID">Chap28</idno>
</biblStruct>
</sourceDesc>
<seriesStmt><idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass></textClass>
<langUsage><language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Abstract: In the classical pattern matching problem, one is given a text and a pattern, both of which are sequences of letters, and is required to find all occurrences of the pattern in the text. We study two modifications of the classical problem, where each letter in the text and pattern is a set (Set Intersection Matching problem) or a sequence (Sequence Matching problem). Two “letters” are considered to be match if the intersection of the two corresponding sets is not empty, or if the two sequences have a common element in the same index. We show the first known non-trivial and efficient algorithms for these problems, for the case the maximum set/sequence size is small. The first, randomized, that takes $\Theta\left( 2^dn\ln n\log m\right)$ time, where d is the maximum set/sequence size, and can also fit, with slight modifications, for the case one is also interested in up to k mismatches. The second is deterministic and takes $\Theta\left( 4^{d}n\log m\right)$ . The third algorithm, also deterministic, is able to count the number of matches at each index of the text in total running time $\Theta\left( \sum_{i=1}^{d} {|\Sigma| \choose i} n\log m \right)$ .</div>
</front>
</TEI>
<affiliations><list><country><li>Israël</li>
</country>
</list>
<tree><country name="Israël"><noRegion><name sortKey="Shiftan, Ariel" sort="Shiftan, Ariel" uniqKey="Shiftan A" first="Ariel" last="Shiftan">Ariel Shiftan</name>
</noRegion>
<name sortKey="Porat, Ely" sort="Porat, Ely" uniqKey="Porat E" first="Ely" last="Porat">Ely Porat</name>
<name sortKey="Porat, Ely" sort="Porat, Ely" uniqKey="Porat E" first="Ely" last="Porat">Ely Porat</name>
<name sortKey="Shiftan, Ariel" sort="Shiftan, Ariel" uniqKey="Shiftan A" first="Ariel" last="Shiftan">Ariel Shiftan</name>
</country>
</tree>
</affiliations>
</record>

Pour manipuler ce document sous Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration

HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000909 | SxmlIndent | more

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000909 | SxmlIndent | more

Pour mettre un lien sur cette page dans le réseau Wicri

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     ISTEX:D11060EDE7AEBA4F13AD081893075264748C74CE
   |texte=   Set Intersection and Sequence Matching
}}

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024

	Serveur d'exploration sur l'OCR
	Attention, ce site est en cours de développement ! Attention, site généré par des moyens informatiques à partir de corpus bruts. Les informations ne sont donc pas validées.

Serveur d'exploration sur l'OCR

Set Intersection and Sequence Matching

Set Intersection and Sequence Matching

Source :

Abstract

Links toward previous steps (curation, corpus...)

Le document en format XML

Pour manipuler ce document sous Unix (Dilib)

Pour mettre un lien sur cette page dans le réseau Wicri